Some Lower and Upper Bounds for Tree Edit Distance
Abstract
In this report I describe my results on the Tree Edit Distance problem [13, 27]. The edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. Tree Edit Distance has applications in many fields such as computer vision, computational biology and compiler optimization. I describe an algorithm that computes the edit distance between two trees of sizes $n$ and $m$, where $m < n$, and runs in $O(nm^2(1+\log\frac{n}{m})) = O(n^3)$ time and $O(nm)$ space. The previously best known algorithm for this problem, due to Philip Klein [22], runs in $O(m^2 n \log n) = O(n^3 \log n)$ time and $O(mn)$ space. Next, a matching lower bound is proved for the family of decomposition strategy algorithms, which includes the previous fastest algorithms for this problem. The best previously known lower bound for this family was $\Omega(n^2 \log^2 n)$. Finally, I describe recent results on the Longest Common Subtree problem. This is an interesting special case of Tree Edit Distance in which only insertions and deletions are considered (i.e., the cost of all relabeling operations is infinite, and the cost of any insertion or deletion is 1). I describe a few algorithms for this problem, the fastest of which runs in $O(Lr \log r \log\log m)$ time, where $L$ is the size of the LCS ($L \le m$) and $r$ is the number of pairs of vertices with matching labels, one from each tree ($r \le nm$). These algorithms combine techniques from sparse string LCS (Longest Common Subsequence) with Tree Edit Distance algorithms. The tree edit distance paper [13] is joint work with Erik Demaine, Benjamin Rossman and Oren Weimann. The longest common subtree paper [27] is joint work with Dekel Tsur, Oren Weimann and Michal Ziv-Ukelson.
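To make the definition above concrete, here is a minimal Python sketch of the textbook forest recursion for tree edit distance, memoized over pairs of sub-forests. It is not the algorithm of the report and makes no attempt to reach its running time; it always removes the rightmost root, which is one fixed way of decomposing the forests. The function name `tree_edit_distance`, the `(label, children)` tuple encoding and the cost parameters are assumptions made for illustration only.

```python
from functools import lru_cache

# Hypothetical encoding: a tree is (label, children), where children is a
# tuple of trees; a forest is a tuple of trees. Ordered, rooted, labeled.

def tree_edit_distance(t1, t2, delete=1, insert=1, relabel=1):
    """Minimum cost of turning t1 into t2 by deleting, inserting and
    relabeling nodes. Illustrative sketch, not the report's algorithm."""

    @lru_cache(maxsize=None)
    def size(forest):
        # Number of nodes in a forest.
        return sum(1 + size(children) for _, children in forest)

    @lru_cache(maxsize=None)
    def dist(f, g):
        if not f:                       # only insertions remain
            return insert * size(g)
        if not g:                       # only deletions remain
            return delete * size(f)
        (lv, cv), (lw, cw) = f[-1], g[-1]   # rightmost roots v and w
        return min(
            # delete v: its children take its place in the forest
            delete + dist(f[:-1] + cv, g),
            # insert w: symmetric to deletion
            insert + dist(f, g[:-1] + cw),
            # match v with w: recurse on their subtrees and on the rest
            (0 if lv == lw else relabel) + dist(cv, cw) + dist(f[:-1], g[:-1]),
        )

    return dist((t1,), (t2,))


# Two small trees: a(b, c) and a(b, d).
t1 = ("a", (("b", ()), ("c", ())))
t2 = ("a", (("b", ()), ("d", ())))

unit = tree_edit_distance(t1, t2)                            # relabel c -> d
no_relabel = tree_edit_distance(t1, t2, relabel=float("inf"))  # delete c, insert d
```

Setting `relabel` to infinity with unit insertion and deletion costs gives the special case described in the abstract; in that setting the distance equals n + m - 2L, where L is the size of the largest common structure obtainable by deletions alone, so for the two trees above (n = m = 3) a distance of 2 corresponds to L = 2.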
Similar resources
Tree Edit Distance Cannot be Computed in Strongly Subcubic Time (unless APSP can)
The edit distance between two rooted ordered trees with n nodes labeled from an alphabet Σ is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. Tree edit distance is a well known generalization of string edit distance. The fastest known algorithm for tree edit dist...
Estimating Upper and Lower Bounds For Industry Efficiency With Unknown Technology
With a brief review of the studies on the industry in Data Envelopment Analysis (DEA) framework, the present paper proposes inner and outer technologies when only some basic information is available about the technology. Furthermore, applying Linear Programming techniques, it also determines lower and upper bounds for directional distance function (DDF) measure, overall and allocative efficienc...
Greedy Construction of DNA Codes and New Bounds
In this paper, we construct linear codes over Z4 with bounded GC-content. The codes are obtained using a greedy algorithm over Z4. Further, upper and lower bounds are derived for the maximum size of DNA codes of length n with constant GC-content w and edit distance d. Keywords: DNA codes, GC-content, edit distance, upper and lower bounds.
An Exact Graph Edit Distance Algorithm for Solving Pattern Recognition Problems
Graph edit distance is an error-tolerant matching technique that has emerged as a powerful and flexible graph matching paradigm for addressing different tasks in pattern recognition, machine learning and data mining; it represents the minimum-cost sequence of basic edit operations that transforms one graph into another by means of insertion, deletion and substitution of vertices and/or edges. ...
Sharp Upper bounds for Multiplicative Version of Degree Distance and Multiplicative Version of Gutman Index of Some Products of Graphs
In 1994, the degree distance of a graph was introduced by Dobrynin, Kochetova and Gutman, and in the same year Gutman proposed the Gutman index of a graph. In this paper, we introduce the concepts of the multiplicative version of the degree distance and the multiplicative version of the Gutman index of a graph. We find the sharp upper bound for the multiplicative version of degree distance and multiplicative ver...